Markov Reward Models and Markov Decision Processes in Discrete and Continuous Time: Performance Evaluation and Optimization

Authors

  • Alexander Gouberman
  • Markus Siegle
Abstract

State-based systems with discrete or continuous time are often modelled with the help of Markov chains. In order to specify performance measures for such systems, one can define a reward structure over the Markov chain, leading to the Markov Reward Model (MRM) formalism. Typical examples of performance measures that can be defined in this way are time-based measures (e.g. mean time to failure), average energy consumption, monetary cost (e.g. for repair or maintenance), or even combinations of such measures. These measures can also be regarded as objectives for system optimization. For that reason, an MRM can be enhanced with an additional control structure, leading to the formalism of Markov Decision Processes (MDP). In this tutorial, we first introduce the MRM formalism with different types of reward structures and explain how these can be combined into a performance measure for the system model. We provide running examples which show how some of the above-mentioned performance measures can be employed. Building on this, we extend to the MDP formalism and introduce the concept of a policy. The global optimization task (over the huge policy space) can be reduced to a greedy local optimization by exploiting the non-linear Bellman equations. We review several dynamic programming algorithms which can be used to solve the Bellman equations exactly. Moreover, we consider Markovian models in discrete and continuous time and study value-preserving transformations between them. We accompany the technical sections by applying the presented optimization algorithms to the example performance models.
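The reduction from a global search over policies to a greedy local step can be illustrated with a minimal sketch. It is not taken from the tutorial: it assumes a discrete-time MDP under the discounted-reward criterion, and the two-state "up/down" machine, its transition probabilities, its rewards, and the names `value_iteration`, `P`, `R` and `gamma` are all hypothetical. Value iteration is used here as one standard dynamic programming scheme; it approximates the Bellman fixed point to a tolerance rather than solving the equations exactly.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Approximate the Bellman fixed point of a discounted MDP.

    P[a, s, t] is the probability of moving from state s to state t
    under action a; R[a, s] is the expected one-step reward.
    Returns the value vector and a greedy policy (one action per state).
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Q[a, s] = R[a, s] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)  # greedy local optimization per state
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = R + gamma * (P @ V)
    return V, Q.argmax(axis=0)


# Hypothetical toy model: state 0 = "up", state 1 = "down";
# action 0 = "wait", action 1 = "repair".
P = np.array([
    [[0.9, 0.1], [0.0, 1.0]],  # wait: may fail when up, stays down
    [[1.0, 0.0], [1.0, 0.0]],  # repair: always ends in "up"
])
R = np.array([
    [1.0, 0.0],   # wait: earn 1 while up, nothing while down
    [0.5, -1.0],  # repair: reduced gain when up, cost when down
])

V, policy = value_iteration(P, R, gamma=0.9)
print("values:", V)
print("policy:", policy)
```

In this toy model the greedy policy waits while the machine is up and repairs it once it is down. Policy iteration would instead alternate an exact linear solve for the value of a fixed policy with the same greedy improvement step.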

Similar resources

Mapping Activity Diagram to Petri Net: Application of Markov Theory for Analyzing Non-Functional Parameters

The quality of an architectural design of a software system has a great influence on achieving non-functional requirements of a system. A regular software development project is often influenced by non-functional factors such as the customers' expectations about the performance and reliability of the software as well as the reduction of underlying risks. The evaluation of non-functional paramet...

Modeling and Evaluation of Stochastic Discrete-Event Systems with RayLang Formalism

In recent years, formal methods have been used as an important tool for performance evaluation and verification of a wide range of systems. From the viewpoint of engineers and practitioners, however, there are still some major difficulties in using formal methods. In this paper, we introduce a new formal modeling language to fill the gaps between object-oriented programming languages (OOPLs) us...

Self-Improving Factory Simulation using Continuous-time Average-Reward Reinforcement Learning

Many factory optimization problems, from inventory control to scheduling and reliability, can be formulated as continuous-time Markov decision processes. A primary goal in such problems is to find a gain-optimal policy that minimizes the long-run average cost. This paper describes a new average-reward algorithm called SMART for finding gain-optimal policies in continuous time semi-Markov decision...

On $L_1$-weak ergodicity of nonhomogeneous continuous-time Markov processes

In the present paper we investigate the $L_1$-weak ergodicity of nonhomogeneous continuous-time Markov processes with general state spaces. We provide a necessary and sufficient condition for such processes to satisfy the $L_1$-weak ergodicity. Moreover, we apply the obtained results to establish $L_1$-weak ergodicity of quadratic stochastic processes.

Publication date: 2012